Which Appears Twice (Practice Interview Question)

You're out of free questions.

I have a list of $n + 1$ numbers. Every number in the range $1..n$ appears once except for one number that appears twice.

Write a function for finding the number that appears twice.

Gotchas

We can do this with $O(1)$ additional memory.

Breakdown

To avoid using up extra memory space, lets use some math!

Solution

First, we sum all numbers $1..n$ . We can do this using the equation:

\frac{n^2 + n}{2}

because the numbers in $1..n$ are a triangular series. ↴

A triangular series is a series of numbers where each number could be the row of an equilateral triangle.

So 1, 2, 3, 4, 5 is a triangular series, because you could stack the numbers like this:

Rows of 1, 2, 3, 4, and 5 circles to show how numbers in a triangular series can be stacked to form a triangle.

Their sum is 15, which makes 15 a triangular number.

A triangular series always starts with 1 and increases by 1 with each number.

Since the only thing that changes in triangular series is the value of the highest number, it’s helpful to give that a name. Let’s call it $n$ .

  # n is 8
1, 2, 3, 4, 5, 6, 7, 8

Triangular series are nice because no matter how large $n$ is, it’s always easy to find the total sum of all the numbers.

Take the example above. Notice that if we add the first and last numbers together, and then add the second and second-to-last numbers together, they have the same sum! This happens with every pair of numbers until we reach the middle. If we add up all the pairs of numbers, we get:

This is true for every triangular series:

Pairs of numbers on each side will always add up to the same value.
That value will always be 1 more than the series’ $n$ .

This gives us a pattern. Each pair's sum is $n+1$ , and there are $\frac{n}{2}$ pairs. So our total sum is: $(n + 1) * \frac{n}{2}$

Or:

\frac{n^2 + n}{2}

Ok, but does this work with triangular series with an odd number of elements? Yes. Let's say $n = 5$ . So if we calculate the sum by hand:

1+2+3+4+5=15

And if we use the formula, we get the same answer:

\frac{5^2 + 5}{2}=15

One more thing:

What if we know the total sum, but we don't know the value of $n$ ?

Let’s say we have:

1 + 2 + 3 + \ldots + (n - 2) + (n - 1) + n = 78

No problem. We just use our formula and set it equal to the sum!

\frac{n^2 + n}{2}=78

Now, we can rearrange our equation to get a quadratic equation (remember those?)

n^2 + n = 156

n^2 + n - 156 = 0

Here's the quadratic formula:

\frac{-b\pm\sqrt{b^2-4ac}}{2a}

If you don't really remember how to use it, that's cool. You can just use an online calculator. We don't judge.

Taking the positive solution, we get $n = 12$ .

So for a triangular series, remember—the total sum is:

\frac{n^2 + n}{2}

Second, we sum all numbers in our input list, which should be the same as our other sum but with our repeat number added in twice. So the difference between these two sums is the repeated number!

  def find_repeat(numbers_list):
    if len(numbers_list) < 2:
        raise ValueError('Finding duplicate requires at least two numbers')

    n = len(numbers_list) - 1
    sum_without_duplicate = (n * n + n) / 2

    actual_sum = sum(numbers_list)

    return actual_sum - sum_without_duplicate

Complexity

$O(n)$ time. We can sum all the numbers $1..n$ in $O(1)$ time using the fancy formula, but it still takes $O(n)$ time to sum all the numbers in our input list.

$O(1)$ additional space—the only additional space we use is for numbers to hold the sums with and without the repeated value.

Bonus

If our list contains huge numbers or is really long, our sum might be so big it causes an integer overflow. ↴

When you create an integer variable, your computer allocates a fixed number of bits for storing it. Most modern computers use 32 or 64 bits. But some numbers are so big they don't fit even in 64 bits, like sextillion (a billion trillion), which is 70 digits in binary.

Sometimes we might have a number that does fit in 32 or 64 bits, but if we add to it (or multiply it by something, or do another operation) the result might not fit in the original 32 or 64 bits. This is called an integer overflow.

For example, let's say we have just 2 bits to store integers with. So we can only hold the unsigned (non-negative) integers 0-3 in binary:

  00 (0)
01 (1)
10 (2)
11 (3)

What happens if we have 3 (11) and we try to add 1 (01)? The answer is 4 (100) but that requires 3 bits and we only have 2.

What happens next depends on the language:

Some languages, like Python or Ruby, will notice that the result won't fit and automatically allocate space for a larger number.
In other languages, like C or Java, the processor will sort of "do its best" with the bits it has, taking the true result and throwing out any bits that don't fit. So in our example above, when adding 01 to 11, the processor would take the true result 100 and throw out the highest bit, leaving 00.
Swift will throw an error if an integer overflows, unless you've explicitly indicated that overflowing integers should be truncated to fit (like in C or Java).

In languages where integer overflow can occur, you can reduce its likelihood by using larger integer types, like C's long long int or Java's long. If you need to store something even bigger, there are libraries built to handle arbitrarily large numbers.

In some languages, you can also take advantage of overflow-checking features provided by the compiler or interpreter.

What are some ways to protect against this?

Do you have an answer?

Show answer

I have an answer Tell me more

Wanna review this one again later? Or do you feel like you got it all?

Mark as done Pin for review later